Sensitive Micro Data Protection Using Latin Hypercube Sampling Technique

Authors

  • Ramesh A. Dandekar
  • Michael Cohen
  • Nancy Kirkendall
Abstract

We propose the use of Latin Hypercube Sampling to create a synthetic data set that reproduces many of the essential features of an original data set while providing disclosure protection. The synthetic micro data can also be used to create either additive or multiplicative noise that, when merged with the original data, can provide disclosure protection. The technique can also be used to create hybrid micro data sets containing pre-determined mixtures of real and synthetic data. We demonstrate the basic properties of the synthetic data approach by applying the Latin Hypercube Sampling technique to a database supported by the Energy Information Administration. The use of Latin Hypercube Sampling, along with the goal of reproducing the rank correlation structure instead of the Pearson correlation structure, has not been previously applied to the disclosure protection problem. Given its properties, this technique offers multiple alternatives to current methods for providing disclosure protection for large data sets.
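The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea: draw a Latin Hypercube sample from each variable's empirical distribution, then reorder the columns so that the synthetic data approximately reproduces the rank (Spearman) correlation structure of the original. It assumes an Iman-Conover style restricted pairing for the reordering step, which is an assumption on our part and may differ from the authors' method. The function names `rank_columns`, `spearman_matrix`, and `lhs_synthetic` are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def rank_columns(a):
    """Return 0-based ranks within each column (ties broken arbitrarily)."""
    return np.argsort(np.argsort(a, axis=0), axis=0)


def spearman_matrix(a):
    """Spearman rank correlation matrix: Pearson correlation of the ranks."""
    return np.corrcoef(rank_columns(a).astype(float), rowvar=False)


def lhs_synthetic(original, m, seed=0):
    """Sketch: m synthetic records via LHS plus rank-correlation pairing.

    Assumes `original` is an (n, d) array of continuous variables whose
    Spearman correlation matrix is positive definite.
    """
    rng = np.random.default_rng(seed)
    d = original.shape[1]

    # Step 1: Latin Hypercube Sampling. Each variable's [0, 1) probability
    # range is split into m equal strata and exactly one point is drawn
    # from every stratum.
    strata = np.stack([rng.permutation(m) for _ in range(d)], axis=1)
    u = (strata + rng.random((m, d))) / m

    # Map the stratified probabilities through each empirical quantile
    # function, so the synthetic marginals follow the original marginals.
    sample = np.column_stack(
        [np.quantile(original[:, j], u[:, j]) for j in range(d)]
    )

    # Step 2 (assumed): Iman-Conover style pairing. Build a score matrix
    # whose Pearson correlation equals the target rank correlation.
    target = spearman_matrix(original)
    base = np.array([NormalDist().inv_cdf(i / (m + 1)) for i in range(1, m + 1)])
    scores = np.column_stack([rng.permutation(base) for _ in range(d)])
    q = np.linalg.cholesky(np.corrcoef(scores, rowvar=False))
    p = np.linalg.cholesky(target)
    mixed = scores @ (p @ np.linalg.inv(q)).T

    # Reorder every sampled column so its ranks match the score matrix.
    # Marginal values are unchanged; only their pairing across columns is.
    order = rank_columns(mixed)
    return np.column_stack(
        [np.sort(sample[:, j])[order[:, j]] for j in range(d)]
    )
```

Because only the pairing of values across columns changes, the stratified marginals from step 1 are preserved exactly, while the rank correlation matches the target only approximately.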


Similar resources

Asymptotically Valid Confidence Intervals for Quantiles and Values-at-Risk When Applying Latin Hypercube Sampling

Quantiles, which are also known as values-at-risk in finance, are often used as risk measures. Latin hypercube sampling (LHS) is a variance-reduction technique (VRT) that induces correlation among the generated samples in such a way as to increase efficiency under certain conditions; it can be thought of as an extension of stratified sampling in multiple dimensions. This paper develops asymptot...


Commodity price uncertainty propagation in open-pit mine production planning by Latin hypercube sampling method

Production planning of an open-pit mine is a procedure during which the rock blocks are assigned to different production periods in a way that leads to the highest net present value (NPV) subject to some operational and technical constraints. This process becomes much more complicated by incorporation of the uncertainty existing in the input parameters. The commodity price uncertainty is among ...


USING LATIN HYPERCUBE SAMPLING BASED ON THE ANN-HPSOGA MODEL FOR ESTIMATION OF THE CREATION PROBABILITY OF DAMAGED ZONE AROUND UNDERGROUND SPACES

The excavation damaged zone (EDZ) can be defined as a rock zone where the rock properties and conditions have been changed due to the processes related to an excavation. This zone affects the behavior of the rock mass surrounding the construction, which reduces the stability and safety factor and increases the probability of failure of the structure. In this paper, a methodology was examined for computing...


Progressive Latin Hypercube Sampling: An efficient approach for robust sampling-based analysis of environmental models

Efficient sampling strategies that scale with the size of the problem, computational budget, and users’ needs are essential for various sampling-based analyses, such as sensitivity and uncertainty analysis. In this study, we propose a new strategy, called Progressive Latin Hypercube Sampling (PLHS), which sequentially generates sample points while progressively preserving the distributional pro...


A conditioned Latin hypercube method for sampling in the presence of ancillary information

This paper presents the conditioned Latin hypercube as a sampling strategy of an area with prior information represented as exhaustive ancillary data. Latin hypercube sampling (LHS) is a stratified random procedure that provides an efficient way of sampling variables from their multivariate distributions. It provides a full coverage of the range of each variable by maximally stratifying the mar...




Publication date: 2002